Observations of distant bright quasars suggest that billion solar mass supermassive
black holes (SMBHs) were already in place less than a billion years after the Big
Bang. Models in which light black hole seeds form by the collapse of primordial
metal-free stars cannot explain their rapid appearance due to inefficient gas
accretion. Alternatively, these black holes may form by direct collapse of gas at the
center of protogalaxies. However, this requires metal-free gas that does not cool
efficiently and thus is not turned into stars, in contrast with the rapid metal
enrichment of protogalaxies. Here we use a numerical simulation to show that mergers
between massive protogalaxies naturally produce the required central gas accumulation
with no need to suppress star formation. Merger-driven gas inflows produce an
unstable, massive nuclear gas disk. Within the disk a second gas inflow accumulates
more than 100 million solar masses of gas in a sub-parsec scale cloud in one hundred
thousand years. The cloud undergoes gravitational collapse, which eventually leads to
the formation of a massive black hole. The black hole can grow to a billion solar
masses in less than a billion years by accreting gas from the surrounding disk.



The conventional scenario postulates that light black hole seeds form from the collapse
of an early generation of metal-free (Population III) stars. Sustained accretion onto
such seed black holes at or above the Eddington rate is required to grow the billion
solar mass SMBHs that are thought to power bright quasars at z ~ 6. Yet numerical
simulations show that the ionized gas surrounding the seeds has very low densities and
high pressures that would prevent the required high accretion rates. Radiative feedback
from accretion and radiation pressure also limit the accretion rate to values well below
Eddington. Dynamical effects, such as the expulsion of a black hole from the host halo
due to asymmetric emission of gravitational waves in mergers (the "gravitational
rocket"), can further stifle the growth of the seeds.

Alternatively, massive black hole seeds exceeding 10^ M⊙ might form directly from
runaway gas collapse at the centers of protogalaxies. The accumulation of large amounts
of gas in galactic nuclei can occur via efficient transport of angular momentum driven
by gravitational torques in galactic disks, via viscous diffusion due to gravito-driven
turbulence, or by means of the magnetorotational instability. Recent models suggest that
under normal thermodynamical conditions of the interstellar medium gas would be
converted into stars faster than it is driven to the center. Instead, large inflow
rates, > 1 M⊙/yr, could accumulate a large central gas concentration without
fragmentation into stars, leading eventually to direct SMBH formation, if molecular
cooling and metal cooling are suppressed. However, even if molecular cooling can be
suppressed at high redshift because H2, the most abundant molecule, is dissociated by
the cosmic ultraviolet background, as soon as some metals are produced after the first
generation of stars protogalaxies would still undergo rapid cooling and star formation,
especially in the presence of dust. Therefore, at present it is unclear if protogalaxies
ever met the conditions required for direct SMBH formation.

Yet, current direct formation models are simple one-dimensional semi-analytical
calculations that start from an initially stable, axisymmetric kiloparsec scale
protogalactic gas disk, and perturb it under the assumption that the inflow occurs in a
steady-state fashion. Neither the initial stability and regular structure, nor the
steady-state condition of the inflow, are representative of galaxies at high redshift,
which are subject to rapid gas accretion and repeated mergers. Moreover, due to their
steady-state nature, these models cannot even demonstrate that gas inflows actually lead
to a central collapse.

The solution to the rapid build-up of SMBHs may be offered by galaxy mergers. In
mergers, tidal torques and shock dissipation are capable of driving most of the gas
content of galaxies from kiloparsec scales down to scales of several tens of parsecs at
rates as high as 10-100 M⊙/yr, forming nuclear gas disks despite high star formation
rates. If such gas inflows could continue all the way down to the very center of the
merger remnant at these high rates they would provide an alternative route to direct
massive black hole formation. Addressing this issue requires a three-dimensional
simulation following gas dynamics across an unprecedented range of spatial scales.

We begin by performing a merger simulation between two identical disk galaxies with
moderate amounts of gas, several kiloparsecs in size and embedded in a 10^12 M⊙ halo
(see Supplementary Information). In the concordance WMAP5 cosmology such objects
would correspond to fairly high density peaks collapsing at z ~ 7-8 (~ 4-5σ, where
σ is the rms variation of the cosmological density field). Recent cosmological
simulations show that massive, kiloparsec-scale rotating disks of stars and gas are
already present in halos with masses > 10^ M⊙ at z > 4. The mass of the dark halo of
each galaxy is consistent with that inferred for the hosts of high-redshift quasars
based on their number densities. The two galaxies are placed on a parabolic orbit whose
parameters are consistent with those found in cosmological simulations. We employ the
technique of smoothed particle hydrodynamics (SPH) to model the galaxy collision,
including radiative cooling, star formation and a temperature floor at 2 x 10^4 K to
mimic non-thermal pressure due to turbulence in the interstellar medium (ISM) (see
Supplementary Information).

Figure 1. Time evolution of the simulated nuclear disk. The surface density maps of the
nuclear disk are shown at both large (upper panels) and small scales (bottom panels).
Particles are color-coded on a logarithmic scale with brighter colors in regions of
higher density. The density ranges from 2 x 10^ to 1 x 10^ M⊙/pc^2 and from 2 x 10^ to
2 x 10^ M⊙/pc^2 in the upper and lower panels, respectively. The time increases from
left to right, corresponding to 9.1 x 10^ yr, 7.49 x 10^ yr and 1.036 x 10^ yr after the
merger. The time of the merger is defined as the time at which the two density peaks
associated with the merging galactic cores are no longer distinguishable. For reference,
the disk orbital time at 20 pc is 5 x 10^ yr, while at 1 pc it is ~ 4 x 10^ yr. Global
spiral modes, in particular the two-armed spiral initially triggered by the final
collision between the two cores, are evident at scales of tens of parsecs (upper panels)
and cause the mass increase in the central parsec region (bottom panels (a)-(b)) that
allows the collapse into a massive central cloud (from bottom panel (b) to bottom panel
(c)).

The two merging galaxies undergo two close encounters as dynamical friction against
their extended halos erodes their orbital energy. At the time when the two baryonic
cores are separated by 6 kpc, when they begin their last orbit before the final
collision, we perform particle splitting in the gas component within a volume 30 kpc in
size (see Supplementary Information). As a result, we increase our gas mass resolution
eight-fold and, simultaneously, we decrease the gravitational softening, achieving a
spatial resolution of 0.1 pc (see Supplementary Information). In this refined
calculation we adopt an effective equation of state (EOS) that accounts for the net
balance between cooling and heating. A lower resolution unrefined simulation shows that
during the final collision the star formation peaks at ~ 30 M⊙/yr over about 10^8 yr.
Motivated by this, we chose an EOS with an effective γ in the range 1.1-1.4, which is
appropriate for interstellar gas heated by a major starburst and assumes solar
metallicity, as suggested by the abundance analysis of the high-redshift QSOs (see
Supplementary Information).

The final collision of the two galactic cores produces a massive turbulent rotating
nuclear disk with a mass of ~ 2 x 10^9 M⊙ and a radius of about 80 pc. With the
increased resolution we can follow the internal evolution of the nuclear disk. The disk
is born in an unstable configuration, with a prominent two-armed spiral pattern
imprinted by the collision and sustained by its own strong self-gravity (Figure 1). The
gas has a high turbulent velocity dispersion (σ ~ 100 km/s) maintained by gravitational
instability and rotates at a speed of several hundred km/s within 50 pc. It constitutes
most of the mass in the nuclear region, the rest being in dark matter and stars. The
spiral arms swiftly transport mass inward and angular momentum outward. Only about 10^
yr after the merger is completed, more than 20% of the disk mass (>~ 5 x 10^8 M⊙)
resides within the central few parsecs (Figure 2), where the inflow rate peaks at
Ṁ > 10^ M⊙/yr (see Figure 1 of Supplementary Information). This corresponds to an
inflow timescale t_inf = M/Ṁ < T_orb, where T_orb is the local orbital timescale. The
inflow rate is orders of magnitude higher than that expected in the isolated, thin,
locally gravitationally unstable disks considered in analytic models of direct massive
black hole formation, but it is consistent with results of three-dimensional simulations
of disks subject to global instabilities (see Supplementary Information). The
self-gravitating nuclear disk does not fragment because of its high effective thermal
pressure and even higher turbulent pressure, which maintain a Toomre Q parameter above
the threshold for stability (see Supplementary Information).
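
The stability argument above rests on the Toomre parameter. As a rough illustration, the
sketch below evaluates the standard gas-disk form Q = σ_eff κ / (π G Σ), with thermal
and turbulent support added in quadrature. The function names, the quadrature sum and
all input values are illustrative assumptions, not quantities taken from the simulation.

```python
import math

# Gravitational constant in pc (km/s)^2 / Msun.
G_PC = 4.30091e-3


def effective_dispersion(sound_speed, turbulent_dispersion):
    """Combine thermal and turbulent support in quadrature (an assumption)."""
    return math.hypot(sound_speed, turbulent_dispersion)


def toomre_q(dispersion, kappa, surface_density):
    """Toomre Q for a gas disk.

    dispersion      -- effective velocity dispersion [km/s]
    kappa           -- epicyclic frequency [km/s/pc]
    surface_density -- gas surface density [Msun/pc^2]
    """
    return dispersion * kappa / (math.pi * G_PC * surface_density)


# Hypothetical inputs, chosen only to exercise the formula.
sigma_eff = effective_dispersion(sound_speed=60.0, turbulent_dispersion=80.0)
q_value = toomre_q(sigma_eff, kappa=20.0, surface_density=5.0e4)
is_stable = q_value > 1.0  # Q > 1 is the usual local stability threshold
```

Because Q scales linearly with the effective dispersion, a larger turbulent
contribution raises Q at fixed surface density, which is the sense in which turbulent
pressure helps the disk resist fragmentation.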

The gas funnelled to the central 2-3 pc region of the nuclear disk settles into a
rotating, pressure-supported cloud. The density of the cloud continuously increases as
it gains mass from the inflow until it becomes Jeans unstable and collapses to
sub-parsec scales on the local dynamical time, t_dyn ~ 10^ yr (Figure 1). The
supermassive cloud contains ~ 13% of the disk mass (~ 2.6 x 10^8 M⊙) (Figure 2). The
simulation is stopped once the central cloud has contracted to a size comparable to the
spatial resolution limit. At this point the cloud is still Jeans unstable. With greater
resolution its collapse should continue since the equation of state would become
essentially isothermal at even higher densities. With its steep density profile
(ρ ~ r^-γ, with γ > 2) the cloud would be very massive and dense even at much smaller
radii. At a radius ~ 10^- pc it would match the conditions of a quasi-star that can then
collapse directly into a massive black hole. Alternatively, it could collapse on a
free-fall timescale into a massive black hole without prior formation of a quasi-star.
Only ~ 10^5 yr have elapsed since the completion of the merger, a timescale much shorter
than the 10^8 years needed to convert most of the nuclear gas into stars during the
starburst (see Supplementary Information). Therefore, in our merger simulation the gas
flows inward and collapses into a compact object much faster than it can form stars,
overcoming the major difficulty of previous direct formation models, which appeal to
metal-free conditions in protogalaxies to suppress star formation.
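
For orientation, the collapse time of a uniform cloud can be estimated from the
textbook free-fall time t_ff = sqrt(3π / (32 G ρ)). The sketch below applies it to a
cloud mass of 2.6 x 10^8 M⊙, consistent with the more than 100 million solar masses
quoted in the abstract. The radius of 2.5 pc and the uniform-density approximation are
illustrative assumptions, not values taken from the simulation.

```python
import math

G_PC = 4.30091e-3            # G in pc (km/s)^2 / Msun
PC_PER_KMS_IN_YR = 9.778e5   # 1 pc / (1 km/s) expressed in years


def free_fall_time_yr(mass_msun, radius_pc):
    """Free-fall time of a uniform sphere, in years."""
    volume = 4.0 / 3.0 * math.pi * radius_pc ** 3
    density = mass_msun / volume                      # Msun / pc^3
    t_code = math.sqrt(3.0 * math.pi / (32.0 * G_PC * density))
    return t_code * PC_PER_KMS_IN_YR


# Assumed radius of 2.5 pc (hypothetical); mass as quoted for the cloud.
t_ff = free_fall_time_yr(mass_msun=2.6e8, radius_pc=2.5)
```

With these assumed inputs the estimate comes out at a few thousand years, far shorter
than the time needed to convert the nuclear gas into stars.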

The very high temperature of the cloud, T > 10^ K, makes star formation in its interior
highly unlikely. Yet, even if less than 1% of the cloud mass collapsed into a black hole
it would still produce an object of mass > 10^ M⊙, which can subsequently grow by
accreting the surrounding nuclear gas disk (see Supplementary Information). Assuming
that the disk forms stars with a ~ 30% efficiency, as deduced from observations and
models of star-forming molecular clouds, a gas mass in excess of 10^9 M⊙ would still be
available to feed the black hole over a timescale longer than 10^8 yr, the duration of
the starburst. Considering Eddington-limited accretion, the black hole can grow to
M_BH ~ 10^9 M⊙ in as little as 3.6 x 10^8 yr (see Supplementary Information). Therefore
the massive seed black hole can grow fast enough for bright quasars to arise within the
first billion years after the Big Bang, namely by z ~ 6. This holds provided that the
galaxy merger also occurs on a timescale shorter than a billion years, a natural
condition in CDM models at high redshift due to the high densities and short orbital
timescales of collapsed objects (see Supplementary Information).
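
The growth argument can be illustrated with the standard exponential solution for
Eddington-limited accretion, M(t) = M_seed exp(t / t_e), where the e-folding time is
t_e = [ε / (1 - ε)] σ_T c / (4π G m_p). The sketch below is a minimal estimate under
stated assumptions: the radiative efficiency ε = 0.1 and the seed mass of 10^5 M⊙ are
hypothetical inputs chosen for illustration, not values quoted in the text.

```python
import math

# Physical constants in SI units.
G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
C = 2.998e8            # speed of light [m/s]
M_PROTON = 1.6726e-27  # proton mass [kg]
SIGMA_T = 6.652e-29    # Thomson cross-section [m^2]
YEAR = 3.156e7         # seconds per year


def efolding_time_yr(efficiency=0.1):
    """e-folding time of Eddington-limited growth, in years.

    efficiency is the assumed radiative efficiency (0.1 is a common
    textbook choice, not a value taken from the text).
    """
    t_edd = SIGMA_T * C / (4.0 * math.pi * G * M_PROTON)  # seconds
    return efficiency / (1.0 - efficiency) * t_edd / YEAR


def growth_time_yr(m_seed, m_final, efficiency=0.1):
    """Time to grow from m_seed to m_final at the Eddington rate."""
    return efolding_time_yr(efficiency) * math.log(m_final / m_seed)


# Hypothetical seed of 1e5 Msun growing to 1e9 Msun.
t_grow = growth_time_yr(1.0e5, 1.0e9)
```

Under these assumptions the e-folding time is about 5 x 10^7 yr and the growth to a
billion solar masses takes well under a billion years; a heavier seed or a lower
efficiency shortens the time further.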

Large deviations from the local M_BH-σ relation (where σ is the velocity dispersion
of the central stellar spheroid) should be expected at high redshift as the black hole
mass grows faster than that of its host galaxy. A black hole mass much larger than that
inferred from the local M_BH-σ relation has actually been suggested for the only
high-redshift quasar for which gas kinematics has been measured. Substantial evolution
in the high-redshift scaling relations between SMBHs and their host galaxies will be
testable with future observations of bright quasar hosts with the James Webb Space
Telescope (JWST) and the Atacama Large Millimeter Array (ALMA). At low redshift the
formation mechanism that we propose should be suppressed as pre-existing SMBHs in the
two merging galaxies heat the surrounding medium while they accrete gas, preventing
instabilities and inflows in the nuclear disk. Finally, future gravitational wave
experiments such as the Laser Interferometer Space Antenna (LISA) will be able to test
the existence of a population of black holes that had a jump start by probing the mass
distribution of merging black holes as a function of redshift.

Figure 2. Evolution of the cumulative gas mass profile of the nuclear disk. Profiles
are normalized to the total gas mass within the radius of the nuclear disk (100 pc). The
disk radius is determined from the sharp drop in the nuclear gas density distribution.
The left panel shows the profile at scales of tens of parsecs, while the right panel
displays the profile within the inner few parsecs. In both panels different curves show
the mass profile at 9.1 x 10^ yr (solid line), 7.49 x 10^ yr (dot-dashed line) and
1.036 x 10^ yr (dashed line) after the merger, these being the same snapshots used in
Figure 1. The time of the merger is defined as the time at which the two density peaks
associated with the merging cores of the galaxies are no longer distinguishable. Mass
redistribution is evident as spiral arms at scales of tens of parsecs push mass inward
and shed angular momentum outward (left panel), gradually leading to an increasing mass
concentration in the central region (right panel). This triggers the Jeans collapse of
the inner few parsecs into a supercloud containing ~ 13% of the total disk mass, which
manifests as a strong flattening of the profile at parsec scales as the cloud absorbs a
large fraction of the mass in that region.
SUPPLEMENTARY INFORMATION

Here we briefly describe the setup of the initial conditions and the numerical methods
used to perform the simulations presented in the Letter. This is followed by a critical
discussion of the assumptions behind the modeling of thermodynamics in the simulations
and by a quantitative discussion of the instability and dynamics of the nuclear disk to
support the results described in this Letter. We conclude with a discussion of the
growth of the massive black hole seed using analytical estimates.

1 Numerical Methods 

1.1 The N-Body+SPH code: GASOLINE

We have used the fully parallel, N-Body+smoothed particle hydrodynamics (SPH) code
GASOLINE to compute the evolution of both the collisionless and dissipative components
in the simulations. A detailed description of the code is available in the literature.
Here we recall its essential features. GASOLINE computes gravitational forces using a
tree-code that employs multipole expansions to approximate the gravitational
acceleration on each particle. A tree is built with each node storing its multipole
moments. Each node is recursively divided into smaller subvolumes until the final leaf
nodes are reached. Starting from the root node and moving level by level toward the
leaves of the tree, we obtain a progressively more detailed representation of the
underlying mass distribution. In calculating the force on a particle, we can tolerate a
cruder representation of the more distant particles, leading to an O(N log N) method.
Since we only need a crude representation for distant mass, the concept of
"computational locality" translates directly to spatial locality and leads to a natural
domain decomposition. Time integration is carried out using the leapfrog method, which
is a second-order symplectic integrator requiring only one costly force evaluation per
timestep and only one copy of the physical state of the system.
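
The leapfrog scheme described above can be sketched in its kick-drift-kick form. The
acceleration computed at the end of one step is reused at the start of the next, so
each step costs a single force evaluation. The function name, the scalar state and the
harmonic-oscillator test force are illustrative assumptions; this is not GASOLINE's
implementation.

```python
import math


def leapfrog(x, v, accel_fn, dt, n_steps):
    """Integrate dx/dt = v, dv/dt = accel_fn(x) with kick-drift-kick leapfrog."""
    a = accel_fn(x)                      # initial force evaluation
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * a        # half kick
        x = x + dt * v_half              # full drift
        a = accel_fn(x)                  # the one force evaluation of this step
        v = v_half + 0.5 * dt * a        # half kick with the new acceleration
    return x, v


# Test problem (an assumption for illustration): unit harmonic oscillator.
x_end, v_end = leapfrog(1.0, 0.0, lambda x: -x, dt=0.01, n_steps=1000)
energy = 0.5 * (x_end ** 2 + v_end ** 2)   # exact value is 0.5
```

For this test problem the energy stays close to its initial value over many periods,
which is the practical benefit of a symplectic integrator in long N-body runs.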

SPH is a technique that uses particles to integrate fluid elements representing gas.
GASOLINE is fully Lagrangian, spatially and temporally adaptive, and efficient for large
N. It employs radiative cooling in the galaxy merger simulation used as a starting point
for the refined simulations presented in this Letter. We use a standard cooling function
for a primordial mixture of atomic hydrogen and helium. We shut off radiative cooling at
temperatures below 2 x 10^4 K, which is about a factor of 2 higher than the temperature
at which atomic radiative cooling would drop sharply for the adopted cooling function.
With this choice we take into account non-thermal, turbulent pressure to model the warm
ISM of a real galaxy. Unless strong shocks occur (this will be the case during the final
stage of the merger) the gaseous disk evolves nearly isothermally, since radiative
cooling is very efficient at these densities (< 100 atoms/cm^3) and temperatures
(10^4 K), and thus rapidly dissipates the compressional heating resulting from the
non-axisymmetric structures (spiral arms, bars) that soon develop in each galaxy as a
result of self-gravity and the tidal disturbance of the companion. The cooling rate
would increase with the inclusion of metal lines, but it has been shown (see section
1.4) that the equation of state of gas at these densities is still nearly isothermal
(γ ~ 0.9-1.1) for a range of metallicities (with γ being lower for higher metallicity),
supporting the validity of our simple choice for the cooling function. Cooling by metals
will surely be important below 10^4 K, but this would be irrelevant in our scheme since
we have imposed a temperature floor of 2 x 10^4 K to account for non-thermal pressure
(see above). The specific internal energy of the gas is integrated using the asymmetric
formulation. With this formulation the total energy is conserved exactly (unless
physical dissipation due to cooling processes is included) and entropy is closely
conserved away from shocks, which makes it similar to alternative entropy integration
approaches. Dissipation in shocks is modeled using the quadratic term of the standard
Monaghan artificial viscosity. The Balsara correction term is used to reduce unwanted
shear viscosity. The galaxy merger simulation includes star formation as well. The star
formation algorithm is such that gas particles in dense, cold, Jeans unstable regions
and in convergent flows spawn star particles at a rate proportional to the local
dynamical time. The star formation efficiency was set to 0.1, which yields a star
formation rate of 1-2 M⊙/yr for models in isolation that have a disk gas mass and
surface density comparable to those of the Milky Way.

1.2 The simulations of galaxy mergers 

For the Letter we performed a refined calculation of a galaxy merger simulation between
two identical galaxies. The initial conditions of this and other similar merger
simulations are described in previous papers. The models are based on our knowledge of
present-day disk galaxies simply because there is little information available from
observations of galaxy structure at high redshift. Yet, this modeling strategy should be
regarded as conservative for the purpose of this paper since high redshift disks are
denser, more gas-rich and more turbulent than present-day galaxies, which should favour
the formation of massive nuclear disks and, subsequently, of large gas concentrations
within them. We employed a multi-component galaxy model constructed using a technique
originally developed elsewhere, its structural parameters being consistent with the
ΛCDM paradigm for structure formation. The model comprises a spherical and isotropic
Navarro-Frenk-White (NFW) dark matter (DM) halo, an exponential disk, and a spherical,
non-rotating bulge. We adopted parameters from the Milky Way model A1 of an earlier
study. Specifically, the DM halo has a virial mass of M_vir = 10^12 M⊙, a concentration
parameter of c = 12, and a dimensionless spin parameter of λ = 0.031. The mass,
thickness and resulting scale length of the disk are M_d = 0.04 M_vir, z_0 = 0.1 R_d,
and R_d = 3.5 kpc, respectively. The bulge mass and scale radius are M_b = 0.008 M_vir
and a = 0.2 R_d, respectively. While a stellar bulge is always present in the most
massive disk galaxies out to z = 1, there is still insufficient knowledge of galactic
structure at higher redshift to confirm that this is the case also at z > 6. However,
the presence of the bulge has the effect of stabilizing the galaxy disks against
external perturbations. This implies that, without the bulge, the two disks would
undergo even stronger non-axisymmetric distortions during the final stage of the merger
and initiate a gas inflow even more powerful than the one obtained in our current
calculation. Hence including the bulge should be regarded as a conservative assumption
for the purpose of this Letter.

The DM halo was adiabatically contracted to respond to the growth of the disk and
bulge, resulting in a model with a central total density slope close to isothermal. The
galaxies are consistent with the stellar mass Tully-Fisher and size-mass relations. The
gas fraction, f_g, is 10% of the total disk mass. This is fairly typical for Milky
Way-sized galaxies at low redshift but it is a conservative assumption for galaxies at
z > 2. The rotation curve of the model is shown in Figure 1. The simulation presented in
this Letter is the refined version of a coplanar prograde encounter. This particular
choice is by no means special for our purpose, except that the galaxies merge slightly
faster than in the other cases, thus minimizing the computational time invested in the
expensive refined simulation. In particular, in previous work we have shown that the
existence of a coherent nuclear disk after the merger is a general result that does not
depend on the details of the initial orbital configuration, including the initial
relative inclination of the two galaxies. Similarly, gas masses and densities in the
nuclear region were found to differ by less than a factor of 2 for runs having the same
initial gas mass fraction in the galaxy disks but different initial orbits.

The galaxies approach each other on parabolic orbits with pericentric distances that
were 20% of the galaxy's virial radius, typical of cosmological mergers. The initial
separation of the halo centers was twice their virial radii and their initial relative
velocity was determined from the corresponding Keplerian orbit of two point masses. Each
galaxy consists of 10^ stellar disk particles, 10^ bulge particles, and 10^ DM
particles. The gas component was represented by 10^ particles. We adopted a
gravitational softening of ε = 0.1 kpc for both the DM and baryonic particles of the
galaxy.
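
The orbital setup described above can be sketched for two point masses on a parabolic
(zero-energy) Keplerian orbit: the relative speed at separation d is
v = sqrt(2 G (M1 + M2) / d), and the pericentric distance fixes the angular momentum.
The function name, the halo masses of 10^12 M⊙ and the virial radius of 200 kpc are
assumptions used only to make the example concrete.

```python
import math

G_KPC = 4.30091e-6  # G in kpc (km/s)^2 / Msun


def parabolic_orbit_velocity(m1, m2, separation, pericenter):
    """Relative velocity of two point masses on a parabolic (zero-energy) orbit.

    Returns (v_radial, v_tangential) in km/s; v_radial is negative because
    the bodies are approaching. Masses in Msun, distances in kpc.
    """
    total_mass = m1 + m2
    speed = math.sqrt(2.0 * G_KPC * total_mass / separation)
    # Specific angular momentum of a parabola with the given pericenter.
    ang_mom = math.sqrt(2.0 * G_KPC * total_mass * pericenter)
    v_tan = ang_mom / separation
    v_rad = -math.sqrt(speed ** 2 - v_tan ** 2)
    return v_rad, v_tan


# Hypothetical virial radius of 200 kpc: separation 2 r_vir, pericenter 0.2 r_vir.
R_VIR = 200.0
v_rad, v_tan = parabolic_orbit_velocity(1.0e12, 1.0e12, 2.0 * R_VIR, 0.2 * R_VIR)
```

The point-mass approximation ignores the extended halos, which is why dynamical
friction subsequently erodes the orbit in the full simulation.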

1.3 The refined simulations of the nuclear region 
1.3.1 Particle splitting 

In this Letter we use the same technique of static particle splitting that has been used
before to study the dynamics of supermassive black hole binaries evolving in
circumnuclear gaseous disks as well as during galaxy mergers, and to study the assembly
of galaxies from the cooling flow in a galaxy-sized halo. A similar technique has been
used by others to study the dynamics of binary black holes in spherical gaseous
backgrounds. In dynamic splitting the mass resolution is increased during the simulation
based on some criterion, such as the local Jeans length of the system. This requires
extreme care when calculating SPH density or pressure at the boundary between the
fine-grained and the coarse-grained volumes. In static splitting, the approach is much
more conservative and one simply selects a subvolume to refine. The simulation is then
restarted with increased mass resolution just in the region of interest. By selecting a
large enough volume for the fine-grained region one can avoid dealing with spurious
effects at the coarse/fine boundary. We select the volume of the fine-grained region
large enough to guarantee that the dynamical timescale of the entire coarse-grained
region is much longer than the dynamical timescale of the refined region. In other
words, we make sure that gas particles from the coarse region will reach the fine region
on a timescale longer than the actual time span probed in this work. This is important
because the more massive gas particles from the coarse region can exchange energy with
the lower mass particles of the refined region via two-body encounters, artificially
affecting their dynamics and thermodynamics. Hence our choice to split in a volume of 30
kpc in radius, while the two galaxy cores are separated by only 6 kpc. The new particles
are randomly distributed according to the SPH smoothing kernel within a volume of size
~ h_p^3, where h_p is the smoothing length of the parent particle. The velocities of the
child particles are equal to those of their parent particle (ensuring momentum
conservation) and so is their temperature, while each child particle is assigned a mass
equal to 1/N_split the mass of the parent particle, where N_split is the number of child
particles per parent particle. The mass resolution in the gas component was originally
2 x 10^4 M⊙ and becomes ~ 3000 M⊙ after splitting, for a total of 1.5 million SPH
particles. The star and dark matter particles are not split, to limit the computational
burden. The softening of the gas particles is reduced to 0.1 pc (it was 100 pc in the
low resolution simulations). For the new mass resolution, the local Jeans length is
always resolved by 10 or more SPH smoothing kernels in the highest density regions
occurring in the simulations.
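
The splitting procedure described above can be sketched as follows. Each parent gas
particle is replaced by N_split children that inherit its velocity and temperature,
share its mass equally, and are scattered around it with a kernel-weighted
distribution. The cubic-spline kernel with support 2 h_p, the rejection sampler, the
dictionary layout and all names are illustrative assumptions, not the code used for the
simulation.

```python
import math
import random


def kernel_shape(q):
    """Cubic-spline SPH kernel shape, normalised so that kernel_shape(0) = 1."""
    if q < 1.0:
        return 1.0 - 1.5 * q * q + 0.75 * q ** 3
    if q < 2.0:
        return 0.25 * (2.0 - q) ** 3
    return 0.0


def sample_offset(h, rng):
    """Draw a 3D offset distributed like the kernel, by rejection sampling."""
    while True:
        offset = [rng.uniform(-2.0 * h, 2.0 * h) for _ in range(3)]
        q = math.sqrt(sum(c * c for c in offset)) / h
        if rng.random() < kernel_shape(q):
            return offset


def split_particle(parent, n_split=8, rng=None):
    """Replace one gas particle by n_split children of equal mass."""
    rng = rng or random.Random()
    children = []
    for _ in range(n_split):
        offset = sample_offset(parent["h"], rng)
        children.append({
            "pos": [p + d for p, d in zip(parent["pos"], offset)],
            "vel": list(parent["vel"]),        # same velocity: momentum conserved
            "temp": parent["temp"],            # same temperature as the parent
            "mass": parent["mass"] / n_split,  # equal share of the parent mass
        })
    return children
```

With n_split = 8 a parent of 2.4 x 10^4 M⊙ yields children of 3000 M⊙, matching the
eight-fold gain in mass resolution; the smoothing lengths of the children would be
recomputed by the SPH code after the restart.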

The softening of dark matter and star particles remains 100 pc because these are not
split. Therefore in the refined simulations stars and dark matter particles essentially
provide a smooth background potential to avoid spurious two-body heating against the
much lighter gas particles, while the computation focuses on the gas component, which
dominates by mass in the nuclear region. By performing numerical tests we have verified
that, owing to the fact that gas dominates the mass and dynamics of the nuclear region,
the large softening adopted for the dark matter particles does not significantly affect
the density profile of the inner dark halo that surrounds the nuclear disk.

1.4 Thermodynamics of the nuclear region 

Model description. In the refined simulations the gas is ideal and each gas particle
obeys P = (γ - 1)ρu. The specific internal energy u evolves with time as a result of
PdV work and shock heating modeled via the standard Monaghan artificial viscosity term
(no explicit radiative cooling term is included).
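
As a small worked illustration of this closure, the sketch below evaluates the pressure
P = (γ - 1)ρu and the corresponding adiabatic sound speed c_s = sqrt(γ P / ρ) for a
given adiabatic index. The function names and the unit-free inputs are assumptions made
for the example.

```python
import math


def pressure(gamma, density, specific_energy):
    """Ideal-gas pressure P = (gamma - 1) * rho * u."""
    return (gamma - 1.0) * density * specific_energy


def sound_speed(gamma, density, specific_energy):
    """Adiabatic sound speed c_s = sqrt(gamma * P / rho)."""
    return math.sqrt(gamma * pressure(gamma, density, specific_energy) / density)


# Same density and specific internal energy, two effective adiabatic indices.
cs_stiff = sound_speed(7.0 / 5.0, density=1.0, specific_energy=1.0)
cs_soft = sound_speed(1.1, density=1.0, specific_energy=1.0)
```

At fixed specific internal energy a softer equation of state (smaller γ) gives a lower
pressure and sound speed, which is why the choice of the effective γ controls how
strongly the gas resists compression.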

The entropy of the system increases as a result of shocks. Including irreversible
heating from shocks is important in these simulations since the two galaxy cores undergo
a violent collision. Shocks are generated even later as the nuclear, self-gravitating
disk becomes non-axisymmetric, developing strong spiral arms. Therefore the highly
dynamical regime modeled here is much different from that considered by previous works
starting from an equilibrium disk model, which could be evolved using a polytropic
equation of state and neglecting shock heating. Radiative cooling is not directly
included in the refined simulations. Instead, the magnitude of the adiabatic index,
namely the ratio between specific heats, is changed in order to mimic different degrees
of dissipation in the gas component, thereby turning the equation of state of the gas
into an "effective" equation of state. We have shown elsewhere that the transition
between the radiative cooling regime and the effective equation of state regime does not
introduce numerical artifacts in the simulation.

Previous work has used a two-dimensional radiative transfer code to study the effective
equation of state of interstellar clouds exposed to the intense UV radiation field
expected in a starburst, finding that the gas has an adiabatic index γ in the range
1.1-1.4 (= 7/5) for densities in the range 5 x 10^ - 10^ atoms/cm^3, comparable to the
volume-weighted mean density in the simulated nuclear disk. Such values of the adiabatic
index are expected for quite a range of starburst intensities, from 10 M⊙/yr to more
than 100 M⊙/yr, hence encompassing the peak star formation rate of ~ 30 M⊙/yr measured
in the original low-resolution galaxy merger simulations. Hence under these conditions
the nuclear gas is not isothermal (γ = 1), which would correspond to radiative cooling
being so efficient as to balance heating coming from compression and/or radiative
processes, as happens in the first stage of the simulation. Its inefficient cooling at
densities of 10^ atoms/cm^3 is mostly due to a high optical depth which causes trapping
in H2O and CO lines. In addition the warm dust heated by the starburst continuously
heats the gas via dust-gas collisions, and the cosmic rays do so as well. We adopt
γ = 7/5 until the first gas inflow is completed in the simulation (we have also run a
case for γ = 1.3 and found that the structure of the nuclear disk is substantially
unchanged). Previous work has already shown that the mean properties (mass, density,
pressure support contributed by the thermal and turbulent components, rotational
velocity) of nuclear disks formed in galaxy mergers and simulated with such an effective
equation of state compare well with the corresponding properties of nuclear disks
observed in merger remnants. For densities above 10^ atoms/cm^3 cooling is more
efficient and γ should drop to ~ 1.1 according to the adopted EOS model. This condition
is verified in the central few parsecs after the first inflow (~ 10^ yr after the
merger), hence γ = 1.1 is adopted from this time onward. By comparing with another
simulation that was continued with γ = 1.4 we verified that the dynamics of the disks at
larger scales (tens of parsecs) are not affected by varying γ.

With our EOS we model the nuclear gas as a one-phase medium. In reality the nuclear
disks will have a complex multi-phase structure with temperatures and densities spanning
orders of magnitude even in localized regions, as shown by detailed numerical
calculations of nuclear disk models. In particular, even at the same radius the disk may
have regions with different densities that may evolve as if the equation of state was
varying locally. Furthermore, based on the assumed EOS model the lowest density
(< 10^ atoms/cm^3) and highest density gas (> 10^ atoms/cm^3) is characterized by an
adiabatic index γ < 1, hence by a lower sound speed v_s = √(γ k_B T/μ). A lower gas
sound speed will make the disk more gravitationally unstable, lowering the Toomre Q
parameter, and thus more prone to fragmentation and star formation. However, as the
Toomre parameter is lowered stronger asymmetries and spiral modes will also occur,
increasing the gas inflow. The question is then whether the net amount of mass
transported to the center will also increase or will be reduced by the competing effect
of star formation. Recent numerical work has shown that even with specific star
formation rates 5 times higher than expected based on the Kennicutt-Schmidt law gas
inflows are still prominent and lead to a significant deposition of mass at the center
of an isolated rotating nuclear gas disk model, strongly suggesting that the results
presented in this Letter will remain valid even in a more realistic calculation
incorporating star formation and local variations of the effective equation of state.

Star formation and feedback The conversion of gas into stars is not included in the
refined calculation with particle splitting presented in this Letter, although its (important)
radiative feedback effect on gas thermodynamics is included through the choice of the EOS
(see previous subsection). Indeed the refined calculation is carried out for less than 1% of
the starburst duration of 10^8 yr indicated by the companion merger simulation carried out
without splitting^^, suggesting that the conversion of gas into stars is not important on the
timescales of interest for this work. Cautionary remarks regarding this point are however
necessary. Had we included star formation directly in the refined simulations we would
probably have found local gas consumption timescales shorter than the global starburst
timescale, since with splitting much higher densities are resolved and the star formation
rate depends on the local gas density. As for the first issue, we can obtain a rough estimate
of how short the star formation timescale can be in the following way. To begin with, in
the nuclear disk most of the gas is at densities above 100 atoms/cm^3. At these densities
molecular hydrogen formation is efficient. Let us then make the extreme assumption
that all the gas in the disk is molecular and readily available for star formation. Then, let
us simply assume that molecular gas will be turned into stars on the local orbital timescale.
Star formation in molecular clouds is rather inefficient, and typically < 30% of the dense,
molecular gas is converted into stars, possibly because internal turbulence in the clouds
prevents them from collapsing altogether. Therefore let us write the star formation rate
in the nuclear disk as a whole as $dM_*/dt = 0.3 \times M_{\rm gas}/T_{\rm orb}$, where $T_{\rm orb} = 2.5 \times 10^5$ years is the
orbital time at the disk half mass radius, 25 pc, and $M_{\rm gas} = 3 \times 10^9\,M_\odot$ is the
enclosed mass of the disk at that radius. The resulting upper limit on the star formation
rate is 3600 $M_\odot$/yr, about 80 times higher than that estimated in the low-res simulations.
Nonetheless, even with such a high star formation rate only 10% of the gas within 25-30 pc
(this is also the region where the first inflow occurs - see next section and Figure 1) would
be converted into stars during the time required to form the central supermassive cloud
(~ 10^5 years). On the other hand, this high star formation rate justifies the use of the EOS
valid for a starburst regime even on the short timescales probed by our calculations. These
simple estimates suggest that our results should be qualitatively valid in general.
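
The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not part of the original analysis: the function names are illustrative, and the exponents used for the gas mass (3 x 10^9 solar masses), the orbital time (2.5 x 10^5 yr) and the cloud formation time (10^5 yr) are assumptions chosen to be consistent with the quoted 3600 solar masses per year and the roughly 10% consumed fraction.

```python
def sfr_upper_limit(m_gas_msun, t_orb_yr, efficiency=0.3):
    """Star formation rate (Msun/yr) if a fraction `efficiency` of the gas
    is turned into stars every orbital time."""
    return efficiency * m_gas_msun / t_orb_yr


def fraction_consumed(sfr_msun_yr, duration_yr, m_gas_msun):
    """Fraction of the gas reservoir converted into stars over `duration_yr`."""
    return sfr_msun_yr * duration_yr / m_gas_msun


# Assumed values (exponents reconstructed, see the text above).
M_GAS = 3.0e9        # enclosed gas mass within 25 pc, Msun
T_ORB = 2.5e5        # orbital time at 25 pc, yr
T_CLOUD = 1.0e5      # time to assemble the central cloud, yr

sfr = sfr_upper_limit(M_GAS, T_ORB)
frac = fraction_consumed(sfr, T_CLOUD, M_GAS)

if __name__ == "__main__":
    print(f"SFR upper limit: {sfr:.0f} Msun/yr")
    print(f"gas fraction consumed: {frac:.2f}")
```

With these inputs the consumed fraction comes out close to the 10% quoted in the text, which is the sense in which star formation is argued to be unimportant over the cloud formation time.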
Figure 3. Time evolution of the gas inflow rate (top panel) and of the amplitude of the strongest
non-axisymmetric mode in the disk (bottom panel, m = 2 is the strongest mode at all times). 
Azimuthally averaged radial profiles are shown in both cases, time averaged over a few outputs 
around a chosen time to avoid selecting transients. From left to right we show profiles at 
t* = 9.1 x 10^ yr (a), 7.49 x 10^ yr (b) and 1.036 x 10^ yr (c) after the merger, these being the
same snapshots used in Figures 1 and 2 of the main paper. The time of the first snapshot (a) was
indeed chosen near the maximum peak of the m = 2 mode, which also corresponds to a maximum 
inflow rate at scales of ~ 10 — 20 pc. At time (b) the Jeans collapse has already started at parsec 
scales, as shown by the very large inflow rate, which is indeed the highest measured throughout 
the simulation. At larger (~ 10 pc) scales the inflow has instead decreased because the dominant 
global m = 2 mode has also weakened considerably (bottom panel). At time (c) the m = 2 
mode has faded considerably (higher order modes are now shown but they are even weaker at 
this time), the inner region is stable while outside the central parsec a net outflow is seen, which 
is responsible for the spreading of the disk highlighted in Figure 2 of the main paper. 
Nonetheless, the arguments just outlined are based on simple scaling laws and, in
particular, neglect the multi-phase nature of the ISM in the nuclear disks, which is important
for star formation and is not accounted for in our EOS model. Furthermore, they are based
on average properties of the disk. In principle, at radii of order a parsec or less, higher
densities and shorter dynamical times would yield higher star formation rates, potentially
lowering the gas mass available to the central cloud. How much of the cloud mass would
be lost to stars will also depend on the balance of heating and cooling within the cloud,
which is beyond the reach of our resolution. However, as explained in the Letter, the high
temperature of the cloud, T > 10^ K, and the fact that further collapse should occur nearly
isothermally as $\gamma$ approaches unity at even higher densities, thus preserving such high
temperatures, argue against the importance of star formation within the cloud. Nonetheless,
the thermodynamics in the interior of the cloud in the later stages of the collapse is not
necessarily captured by our EOS model (see ^^ on the importance of the form of the EOS
for fragmentation of gas clouds). For example, if a quasi-star forms at scales well below
our resolution as a result of the collapse, only a small fraction of its mass would collapse
instantaneously into a black hole, while the rest would be accreted at super-Eddington rates
from the gaseous envelope of the quasi-star on timescales of 10^ yr^^. In this case there will
be time for the rapidly growing seed black hole to affect the surrounding gas via radiative
feedback, which would translate into a modification of the gas EOS and a reduced
efficiency in the growth of the massive seed. This is why we decided to be conservative and
assumed that only 0.1% of the central gas cloud would end up in a black hole.

Along with star formation, another astrophysical aspect that we include in a very
simplified way is feedback from star formation. Radiative feedback from stars is implicitly
included in our choice of the effective equation of state (see above), but feedback from
supernova explosions is not taken into account. However, feedback from type II
supernovae would contribute to both heating the gas and increasing its turbulence (the timescale of
type II supernova explosions is sufficiently short to be relevant here), which should go in
the direction of decreasing the star formation rate and therefore strengthening our previous
argument concerning the role of star formation. It would also tend to stiffen the equation
of state even in the central, high density regions, likely bringing our "mean" effective $\gamma$ in
closer agreement with the values adopted here. On the other hand, a quick calculation suggests
that feedback should not have a major impact on the global energetics of the nuclear disk.
In fact, assuming the upper limit on the star formation rate of 3600 $M_\odot$/yr and a Miller-Scalo
initial stellar mass function we obtain that supernovae should dump < 2.5 x 10^52
erg/yr (7 x 10^48 erg per solar mass of stars formed) into the surrounding gas, corresponding
to < 2.5 x 10^57 erg dumped during the black hole formation timescale, ~ 10^5 yr. This is at
most 20% of the overall internal energy budget of the gas in the nuclear disk (the sum
of turbulent, rotational and thermal energy), hence the effects on thermodynamics will be
moderate. In addition, most of this energy will be deposited outside the inner few parsecs,
since there the strong radial inflow will dominate over star formation because the
inflow timescale is shorter than the orbital timescale on which star formation proceeds. As
a result, the dynamics of the gas inflow, which are crucial in our formation model, should
not be significantly affected.
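
The supernova energy budget quoted above is a simple product of rates, sketched below in Python. The function names are illustrative, and the two-digit exponents (7 x 10^48 erg per solar mass formed, a formation timescale of 10^5 yr) are assumptions chosen so that the numbers are mutually consistent with the 3600 solar masses per year upper limit; they are not independently derived here.

```python
def supernova_energy_rate(sfr_msun_yr, erg_per_msun=7.0e48):
    """Energy injection rate (erg/yr) for a given star formation rate,
    assuming a fixed supernova energy yield per solar mass of stars formed."""
    return sfr_msun_yr * erg_per_msun


def supernova_energy_injected(sfr_msun_yr, duration_yr, erg_per_msun=7.0e48):
    """Total energy (erg) injected over `duration_yr` at constant SFR."""
    return supernova_energy_rate(sfr_msun_yr, erg_per_msun) * duration_yr


SFR_MAX = 3600.0      # upper limit on the star formation rate, Msun/yr
T_FORM = 1.0e5        # assumed black hole formation timescale, yr

rate = supernova_energy_rate(SFR_MAX)
total = supernova_energy_injected(SFR_MAX, T_FORM)

if __name__ == "__main__":
    print(f"injection rate: {rate:.2e} erg/yr")
    print(f"total over formation time: {total:.2e} erg")
```

Because the star formation rate is itself an upper limit, both numbers should be read as upper bounds.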

Feedback from the accreting massive seed black hole Finally, once the black hole
forms it will start accreting and will lose part of the accretion energy either radiatively or
in part radiatively and in part in the form of mechanical energy if a powerful jet is produced,
as observed in some local and distant AGNs. This is the so-called "AGN feedback",
whose effect we have not taken into account in the simple calculation of the timescale for
the black hole to grow up to 10^9 $M_\odot$. Indeed, based on recent calculations^^, a SMBH
starting with 10^ $M_\odot$ should be able to accrete enough mass by z = 6 even when the
radiative feedback from black hole accretion (AGN feedback) is accounted for. Note that
hydrodynamical effects and momentum deposition due to jets would deserve a separate
discussion but are currently not well understood.

At low redshift two accreting SMBHs should be already present in the galaxies as they 
merge. Heating of the surrounding gas by "AGN feedback" might be strong enough to 
overcome cooling and produce a stiffer equation of state in the nuclear gas^. In this case,
either a dense nuclear disk will not form at all^, and so no multi-scale gas inflows will
occur, or the higher pressure support of the gas within the nuclear disk could prevent the
secondary inflow and thus the formation of the central supermassive cloud.

2 Mass transport and stability of the nuclear region

We have measured the strength of the non-axisymmetric modes in the nuclear disk using
a Fourier decomposition in order to establish a clear correlation between the regions of the
disk at which the maximum inflow occurs and the amplitude of the
strongest mode. This is shown in the panels of Figure 1. The strongest mode in the disk
is at all times a two-armed spiral, corresponding to m = 2 in the Fourier decomposition,
as also apparent in Figure 1 (top panel) of the main paper. Such a mode is the imprint
of the collision between the two galaxy cores. The inflow rate is remarkable, peaking at
> 10^ $M_\odot$/yr, which corresponds to radial velocities of about 100 km/s. This is sustained
for only a few 10^ yr, bringing a few 10^ $M_\odot$ of gas within the inner 10 pc. Note
that the large radial velocities are of the order of the turbulent velocities seen in nuclear disks
residing at the center of merger remnants^^. In Figure 1 we also show the second radial
inflow triggered by the onset of the Jeans collapse of the gas within the central parsec. The
collapse begins at about a parsec scale as soon as the enclosed mass climbs above the local
Jeans mass (~ 7 x 10^ $M_\odot$ at r = 1 pc; note that using the Bonnor-Ebert mass would yield
essentially the same result within a factor ~ 2). Gas at about a parsec scale rotates at a
speed of about $v_{\rm rot} \sim 600$ km/s but it is pressure supported, as the temperature rises close
to 10^8 K at such scales owing to adiabatic compression ($v_s \sim 1000$ km/s $> v_{\rm rot}$). The fact
that pressure provides the most important support against gravity justifies our use of the
Jeans mass to characterize the phase of collapse, although a more complete description of
the process would involve accounting for the effect of rotation and the continued mass flux
from the outer region of the disk.
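
The text does not spell out the estimator used for the Fourier decomposition. A common choice, assumed in the sketch below, is the mass-weighted amplitude $A_m = |\sum_j m_j e^{i m \phi_j}| / \sum_j m_j$ evaluated over the particles in an annulus. The function name and the synthetic ring used to exercise it are illustrative only.

```python
import cmath
import math


def mode_amplitude(azimuths, masses, m):
    """Normalized amplitude of the azimuthal Fourier mode `m` for a set of
    particles with azimuthal angles `azimuths` (radians) and masses `masses`."""
    total_mass = sum(masses)
    coeff = sum(w * cmath.exp(1j * m * phi) for w, phi in zip(masses, azimuths))
    return abs(coeff) / total_mass


# Synthetic annulus: evenly spaced particles whose masses follow a two-armed
# pattern, i.e. surface density proportional to 1 + 0.5 cos(2 phi).
N = 3600
azimuths = [2.0 * math.pi * k / N for k in range(N)]
masses = [1.0 + 0.5 * math.cos(2.0 * phi) for phi in azimuths]

a1 = mode_amplitude(azimuths, masses, 1)
a2 = mode_amplitude(azimuths, masses, 2)

if __name__ == "__main__":
    print(f"A_1 = {a1:.3e}, A_2 = {a2:.3f}")
```

For a density perturbation of relative amplitude 0.5 in cos(2 phi), this estimator returns half of that value for m = 2 and essentially zero for m = 1. In practice the same function would be applied annulus by annulus to build a radial profile like the one in the figure.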

We note that, owing to the high mass resolution reached after applying particle splitting, 
the local Jeans length is always well resolved throughout the disk, in the sense that it is 
about an order of magnitude larger than either the local SPH smoothing length or the 
gravitational softening^^. The smoothing length is comparable with the softening at parsec 
scales, but becomes nearly an order of magnitude smaller than the latter when approaching 
a fraction of a parsec, suggesting that, if anything, the collapse may be slowed down once 
the cloud contracts to a size of order the softening. This implies that our conclusion
that the inner supermassive cloud will continue to collapse further should be regarded as
conservative; with greater resolution, the collapse should not only continue but
would likely be faster.

The right snapshots of Figure 1 show the final expansion of the disk as the spiral 
arms unwind and transfer angular momentum outward of tens of parsecs, generating a 
net outflow. At this point the inner profile has reached stability as further collapse is not 
possible once the resolution limit is reached. Expansion of the disk as a result of angular 
momentum transfer driven by spiral modes is a well documented phenomenon in both 
gaseous and stellar disks from galactic to planetary scales^^'^^. 

In Figure 2 we show the evolution of the Toomre parameter, which strictly measures 
the local stability of a differentially rotating disk to axisymmetric perturbations^^. It is im- 
portant to note that the disk is born out of equilibrium from the collision of the two cores, 
rather than becoming unstable starting from an equilibrium rotational configuration as as- 
sumed in the standard perturbative approach. The disk indeed reaches a near-equilibrium 
configuration after the phase of intense inflows studied in this paper. Nevertheless, pre- 
vious studies on gaseous disks have shown that the Toomre parameter provides a good 
empirical measure of the susceptibility of disks to fragmentation quite irrespective of how 
the disk is initially set up, and in this sense also applies well to global stability to generic, 
non-axisymmetric perturbations^^. The Figure shows that the Toomre parameter remains 
always in the theoretical regime of stability against fragmentation, although it drops ini- 
tially to values in the range 1 — 1.5 where strong spiral instabilities are expected and are 
indeed observed. After the phase of strong non-axisymmetric instability associated with 
the inflow and subsequent central collapse is terminated (first and second panel) the disk 
self-regulates to a more stable state, with a minimum Q ~ 2.
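
The Toomre parameter used here, $Q = \kappa v / (\pi G \Sigma)$ with an effective velocity that adds the thermal sound speed and the radial velocity dispersion in quadrature, can be evaluated as in the following sketch. The function names, the unit system and the input values are assumptions made for illustration; they are not measurements from the simulation.

```python
import math

# Gravitational constant in pc (km/s)^2 / Msun.
G_PC_KMS2_MSUN = 4.30091e-3


def effective_sound_speed(v_s_kms, sigma_r_kms):
    """Quadrature sum of thermal sound speed and radial velocity dispersion."""
    return math.sqrt(v_s_kms**2 + sigma_r_kms**2)


def toomre_q(kappa_kms_pc, v_s_kms, sigma_r_kms, sigma_msun_pc2):
    """Toomre Q for epicyclic frequency kappa (km/s/pc) and gas surface
    density Sigma (Msun/pc^2), using the effective sound speed."""
    v_eff = effective_sound_speed(v_s_kms, sigma_r_kms)
    return kappa_kms_pc * v_eff / (math.pi * G_PC_KMS2_MSUN * sigma_msun_pc2)


# Purely illustrative inputs (hypothetical, not taken from the simulation).
q_example = toomre_q(kappa_kms_pc=10.0, v_s_kms=3.0, sigma_r_kms=4.0,
                     sigma_msun_pc2=1000.0)

if __name__ == "__main__":
    print(f"Q = {q_example:.2f}")
```

Raising either velocity term raises Q linearly through the effective sound speed, which is why a gravoturbulent disk can stay above the fragmentation threshold even at very high surface densities.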

It is instructive to compare the very high inflow rates measured in the nuclear disk with 
the expectations of analytical models that study a self-gravitating, thin isothermal accretion 
disk in steady state. Note that such models are based on a local stability analysis, in 
contrast with the global character of instability in our nuclear disk. Under such assumption, 
recent works argue that there exists a maximum inflow rate in such a disk above which
fragmentation, and thus star formation, will occur. In steady state, the maximum
inflow rate can be expressed as $\dot{M}_{\max} = 2\alpha v_s^3/G$, where $\alpha \sim 0.06$ is the maximum disk
viscosity, resulting from gravitational stresses, G is the gravitational constant, and $v_s$ is the
sound speed. They consider protogalactic disks with temperatures ~ 4000 K ($v_s \sim 5$ km/s)
for which $\dot{M}_{\max}$ ~ 10^- $M_\odot$/yr. In our disks the thermal sound speed is much higher, but
because the disks are in a gravoturbulent state, it is more sensible to consider the sum of
the thermal sound speed and turbulent velocity dispersion, as we have done for the Toomre
Q parameter (see Figure 2). This amounts to about 600 - 700 km/s at scales of 25 pc
(the scale of the first inflow - see Figure 1), which implies an increase by a factor of up to
$140^3 = 2.7 \times 10^6$, which would yield a maximum inflow rate ~ 10^ $M_\odot$/yr, in good agreement
with the results shown in Figure 1 (first panel). The second inflow at parsec scales is even
higher and seems to overshoot the maximum rate possible without fragmentation, but 
since this is triggered by the Jeans collapse of the central cloud rather than by transport by 
spiral waves (see Figure 1) the argument based on the maximum viscous stress associated 
with gravitational instability does not apply anymore. In addition, the assumption of 
steady state, which is not appropriate in general under the highly dynamical conditions 
of our nuclear disk, clearly does not apply once the central cloud begins to collapse. As 
anticipated above, the analytical approach just outlined also stems from a local stability 
analysis of self-gravitating accretion disks^^, while the gravoturbulent state of the nuclear 
disk is inherently related to the global character of the non-axisymmetric instability; in 
globally unstable disks "effective" $\alpha$ viscosities even larger than unity can easily arise^^.
Dropping the constraint $\alpha \sim 0.06$, as suggested by the nature of global instability, would
allow even higher inflow rates without fragmentation. 
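
The cubic dependence of the maximum inflow rate on the (effective) sound speed is easy to verify numerically. The sketch below evaluates $2\alpha v^3/G$ in SI units and converts to solar masses per year; the function name and the rounded physical constants are choices made here for illustration.

```python
G_SI = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN_KG = 1.989e30     # solar mass, kg
YEAR_S = 3.156e7        # one year, s


def max_inflow_rate(v_kms, alpha=0.06):
    """Maximum steady-state inflow rate 2*alpha*v^3/G in Msun/yr for a
    sound speed (or effective velocity dispersion) given in km/s."""
    v_si = v_kms * 1.0e3
    mdot_kg_s = 2.0 * alpha * v_si**3 / G_SI
    return mdot_kg_s * YEAR_S / M_SUN_KG


mdot_cold = max_inflow_rate(5.0)     # cold protogalactic disk, v ~ 5 km/s
mdot_hot = max_inflow_rate(700.0)    # gravoturbulent nuclear disk, v ~ 700 km/s
boost = mdot_hot / mdot_cold         # equals (700/5)^3 = 140^3

if __name__ == "__main__":
    print(f"cold disk: {mdot_cold:.2e} Msun/yr")
    print(f"hot disk:  {mdot_hot:.2e} Msun/yr")
    print(f"boost factor: {boost:.3e}")
```

With these constants the cold-disk limit is a few times 10^-3 solar masses per year and the hot-disk limit is of order 10^4 solar masses per year, so the factor of 140^3 alone accounts for the difference between the two regimes.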

In any case, our simple analysis shows how a hot, gravoturbulent disk can easily sustain 
accretion rates orders of magnitude higher than those of cold, non-fragmenting protogalactic 
disks considered in recent literature, hence allowing naturally the formation of central 
supermassive objects without drainage of gas by star formation. These particular conditions 
arise naturally in the highly perturbed nuclear disk emerging from a galaxy merger, and 
should indeed be interpreted as a by-product of the starburst itself. Thus, in our model
star formation has indeed a positive role in allowing the direct collapse since it provides 
enough heating to prevent large-scale fragmentation. 

Finally, typical galaxies at high redshift would have larger reservoirs of gas to form and
feed the black hole relative to the initial conditions adopted in this Letter. Their more
compact galactic disks (disk sizes are expected to scale as $(1+z)^{-3/2}$, as do their embedding dark
halos in the CDM cosmogony, see also next section) would have produced more compact
and dense gaseous nuclei after the merger, whose shorter dynamical timescales would drive
an even faster mass transport towards the center via gravitational instabilities. Hence the
results presented here should be regarded as conservative for the purpose of our scenario.
Figure 4. Azimuthally averaged Toomre Q profile of the disk shown at the same times as in
Figure 1 (time increasing from left to right). The shaded area corresponds to the instability region
marked by Q < 0.67, this being the stability threshold for finite thickness disks^^. The Toomre
parameter is calculated as $Q = \kappa v_{s,\rm eff}/\pi G \Sigma$, where $v_{s,\rm eff}$ is the effective sound speed including the
contribution of the turbulent (radial) velocity dispersion, $v_{s,\rm eff} = \sqrt{v_s^2 + \sigma_r^2}$, where $v_s$ is the
thermal sound speed and $\sigma_r$ the radial velocity dispersion of the gas. Note that $\sigma_r > v_s$ outside
the central few parsecs; within the inner few parsecs, especially after the onset of the Jeans collapse
(panels b - c), the thermal pressure is dominant as the gas is strongly adiabatically compressed
(the high central value of Q seen in panel (c) is associated with the formation of the supermassive
hot central cloud).
3 Growth of the massive black hole seed

The time $t(M_{\rm BH})$ required for a black hole of initial mass $m_{\rm seed}$ to reach a mass $M_{\rm BH}$
assuming Eddington limited accretion^^ is given by
$t(M_{\rm BH}) = \tau \times f_{\rm edd}^{-1} \times \left(\frac{\epsilon_r}{1-\epsilon_r}\right) \ln(M_{\rm BH}/m_{\rm seed})$,
where $M_{\rm BH}$ = 10^ $M_\odot$, $m_{\rm seed}$ = 10^ $M_\odot$, $f_{\rm edd}$ expresses the fraction of the Eddington
limit at which the black hole is accreting, and $\epsilon_r$ is the radiative efficiency. In our case $M_{\rm BH} = 10^9\,M_\odot$
and $m_{\rm seed} = 2.6 \times 10^5\,M_\odot$ (corresponding to 0.1% of the mass of the collapsing central cloud
in our simulation). Furthermore, we choose standard parameters $f_{\rm edd} = 1$, $\epsilon_r = 0.1$ and, to
compute the characteristic accretion timescale $\tau$, we adopt a molecular weight per electron
for a plasma at zero metallicity with cosmic abundance of hydrogen ($X = 0.75$) and helium
($Y = 0.25$), $\mu_e = 1/(1 - Y/2) = 1.14$, so that $\tau = 0.45\,\mu_e^{-1}$ Gyr = 0.395 Gyr. Note that
metallicity effects are marginal in this calculation because they only have a negligible effect
on the value of the molecular weight. With these choices, we obtain $t(M_{\rm BH}) = 0.362$ Gyr.
Assuming that the seed black hole can accrete at the Eddington limit ($f_{\rm edd} = 1$) is justified
by the fact that the hole would accrete the gas belonging to the nuclear disk, which has
very high densities $n_H$ ~ 10^ - 10^ atoms/cm^3 at scales below 10 pc. Such densities are
several orders of magnitude higher than e.g. those of the gas surrounding the small black
hole seeds formed by the collapse of Pop III stars, which accrete at sub-Eddington rates^^.
Therefore, not only can our model lead to seeds that are much more massive than
those resulting from Pop III stars, but the subsequent gas accretion and growth also occur
in a much more favourable environment.
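
The growth time formula above is straightforward to evaluate. The sketch below implements it as written, with $\tau = 0.45/\mu_e$ Gyr and $\mu_e = 1/(1 - Y/2)$; the function names are illustrative, and the seed and final masses (2.6 x 10^5 and 10^9 solar masses) are the values that reproduce the quoted 0.362 Gyr. Small differences in the third decimal arise because the text rounds $\mu_e$ to 1.14.

```python
import math


def mu_e(helium_mass_fraction):
    """Molecular weight per electron for a fully ionized H + He plasma."""
    return 1.0 / (1.0 - helium_mass_fraction / 2.0)


def growth_time_gyr(m_bh, m_seed, f_edd=1.0, eps_r=0.1, helium_mass_fraction=0.25):
    """Time (Gyr) to grow from m_seed to m_bh at a fraction f_edd of the
    Eddington limit with radiative efficiency eps_r."""
    tau = 0.45 / mu_e(helium_mass_fraction)   # characteristic timescale, Gyr
    return tau / f_edd * eps_r / (1.0 - eps_r) * math.log(m_bh / m_seed)


M_BH = 1.0e9      # final black hole mass, Msun
M_SEED = 2.6e5    # seed mass, Msun (0.1% of the central cloud)

t_standard = growth_time_gyr(M_BH, M_SEED)              # eps_r = 0.1
t_mhd = growth_time_gyr(M_BH, M_SEED, eps_r=0.2)        # MHD-disk-like efficiency
t_max_spin = growth_time_gyr(M_BH, M_SEED, eps_r=0.42)  # maximally spinning hole

if __name__ == "__main__":
    for label, t in [("eps_r=0.10", t_standard), ("eps_r=0.20", t_mhd),
                     ("eps_r=0.42", t_max_spin)]:
        print(f"{label}: {t:.3f} Gyr")
```

Because the efficiency enters as $\epsilon_r/(1-\epsilon_r)$, doubling it from 0.1 to 0.2 more than doubles the growth time, which is why the result is so sensitive to the assumed black hole spin.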

Although we have adopted standard values of the parameters to calculate $t(M_{\rm BH})$ we
can ask how much these can be varied while still being able to reach a billion solar masses
within the first billion years. The timescale, due to its functional form, is especially sensitive
to the radiative efficiency. Theoretically, the actual value of $\epsilon_r$ depends on the spin of the
black hole, which in turn depends on the uncertain mechanism of accretion, such as whether the
accretion occurs with or without magnetohydrodynamical (MHD) effects^^. It may be as
large as 0.42 for maximally spinning black holes^^ accreting in standard thin disks, in
which case $t(M_{\rm BH})$ would exceed 1 Gyr. The timescale would instead be ~ 0.8 Gyr for
$\epsilon_r = 0.2$, as possible in MHD disks. It is important to stress that values of $\epsilon_r$ in the range
0.1 - 0.2 should be regarded as more realistic since they are independently inferred from
the ratio R of the QSO plus AGN luminosity density to the mass density of SMBHs in
nearby galaxies^^. In summary, $t(M_{\rm BH}) < 1$ Gyr requires $\epsilon_r < 0.2$ and $f_{\rm edd} \gtrsim 0.7$. On the
other hand, if the galaxy merger occurs at z ~ 8, and the black hole has to grow to its final
mass by z ~ 6, the time available for accretion is < 0.5 Gyr for the standard WMAP5
cosmology, which then favours our standard choice of parameters $f_{\rm edd} = 1$ and $\epsilon_r = 0.1$.

As highlighted in the Letter, for the SMBHs to grow as required in the short time
available it is also necessary that the merger timescale is significantly shorter than a billion
years. Using simple scaling relations for galaxies and their halos in CDM models^^ one
can show that, at a fixed virial mass $M_{\rm vir}$, these are both smaller and denser compared
to z = 0. This reflects the fact that the Universe itself is denser at higher redshift. This
significantly reduces the typical orbital time between pairs of collapsed objects, and thus
the merging timescale, relative to z = 0. The orbital time during the merger (assuming
for simplicity a circular orbit) is of order $2\pi R_{\rm vir}/V_c$, where the halo circular velocity $V_c$
and the virial radius $R_{\rm vir}$ at fixed $M_{\rm vir}$ increase and decrease, respectively, with increasing
redshift. Using the standard scaling relations of Mo, Mao & White^^ for such quantities
one can show that $T_{\rm orb} \sim (1+z)^{-3/2}$. Since at z = 0 the merging timescale on a typical
cosmological orbit is ~ 5 Gyr^, it follows that at z = 8 this drops to ~ 0.2 Gyr, which is
comfortably shorter than a billion years.
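
The redshift scaling of the merging timescale can be checked directly. The sketch below assumes the power law $T \propto (1+z)^{-3/2}$ normalized to 5 Gyr at z = 0, which is the combination that reproduces the quoted ~ 0.2 Gyr at z = 8; the function name is illustrative.

```python
def merging_timescale_gyr(z, t0_gyr=5.0, exponent=-1.5):
    """Merging timescale at redshift z, scaled from its z = 0 value t0_gyr
    as (1 + z)**exponent."""
    return t0_gyr * (1.0 + z) ** exponent


t_z8 = merging_timescale_gyr(8.0)
t_z6 = merging_timescale_gyr(6.0)

if __name__ == "__main__":
    print(f"z = 8: {t_z8:.3f} Gyr")
    print(f"z = 6: {t_z6:.3f} Gyr")
```

Over the whole range z = 6 to 8 the scaled timescale stays well below 1 Gyr, consistent with the requirement stated in the text.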

Finally, in the conventional model in which light seed black holes originate from the 
collapse of Pop III stars their growth can be hampered if they are kicked out after merging 
with another hole due to the "gravitational rocket" effect^^. This phenomenon can have an 
important, negative impact on the growth of large black holes^^. In our direct formation 
model two obvious scenarios are possible; either the black hole grows in a galaxy with
no pre-existing hole, as we assumed throughout the paper, or it grows in a galaxy where a
light "Pop III seed" hole is also present (this would have remained light due to inefficient
accretion). In the first case the gravitational rocket is not important. In the second case, 
the lighter black hole could be kicked out of the galaxy, but since it would be orders of 
magnitude less massive it will not have an impact on the final mass growth of the primary 
black hole formed by direct collapse. In a third, less trivial scenario, the galaxy merger 
remnant could merge with a third, similarly massive galaxy in which another black hole 
has grown to a similarly large mass, after forming by direct collapse during a previous 
merger. In this case the gravitational rocket may be important since the two holes may 
have a similar (large) mass when they merge at the center of the remnant. However, the 
probability that this may happen before each black hole has reached a billion solar masses 
is quite low, since the merger timescale at z = 6 - 8 is not smaller than the characteristic
growth timescale of the black hole calculated with our standard parameters, but rather is of the
same order (see main paper). We conclude that, in general, kicks due to the gravitational
rocket should not affect the conclusions of our work. 